SUMMARY

Bayesian solutions of tracking problems that involve measurement association
uncertainty give rise to Gaussian mixture distributions, which are composed
of an ever increasing number of components. To implement such a tracking filter,
the growth of components must be controlled by approximating the mixture
distribution. A popular and economical scheme is the Probabilistic Data Association
Filter (PDAF), which reduces the mixture to a single Gaussian component at each
time step. However, this approximation may destroy valuable information,
especially if several significant, well spaced components are present.

In this Report, two new algorithms for reducing Gaussian mixture distributions
are presented. These techniques preserve the mean and covariance of the
original mixture, and the final approximation is itself a Gaussian mixture. The
reduction is achieved by successively merging components or groups of components.
The two algorithms have been used to control the growth of components which
occurs with the solution to the problem of tracking a single object in the
presence of uniformly distributed false measurements. Simulation results are
presented which compare the performance of the resulting tracking filters and the PDAF.
1  INTRODUCTION

A tracking filter is an algorithm for estimating the position (and possibly
also the velocity or other factors) of an object from measurements of a sensor
such as a radar. In general the sensor will produce measurements from the
required object and also from random noise interference, clutter and other
objects. Usually it is not possible to distinguish with certainty between the
useful measurements from the object and other unwanted measurements. In these
circumstances the computational requirements of the full Bayesian solution to
this problem rapidly increase as tracking proceeds. This Report is concerned
with methods for containing the computational requirements within specified
bounds, while minimizing the consequent performance penalty.

It  is  usual  practice  for  the  computational  demands  of  the  tracking  filter 
to  be  controlled  in  two  stages  on  every  occasion  that  measurements  are  received 
from  the  sensor.  The  first  of  these  is  a  coarse  acceptance  test  which  is  applied 
before  new  measurements  are  processed,  while  the  second  stage  is  applied  after 
processing.  The  acceptance  test  is  effectively  a  tracking  gate  which  rejects  any 
measurements  which  are  very  unlikely  to  originate  from  the  object  of  interest, 
and  since  it  is  applied  before  processing  it  is  computationally  inexpensive. 

This  type  of  test  is  well  known  (see  Refs  2,  9  and  11)  and  is  widely  applied  to 
measurement  association  problems  where  ambiguities  may  exist.  Therefore  the 
acceptance  test  will  not  be  considered  further  in  the  main  text  of  this  Report, 
although  its  application  to  the  simulation  example  is  described  in  Appendix  D. 
After  processing  of  the  accepted  measurements  is  complete,  it  may  be  necessary 
to  approximate  the  solution  to  avoid  an  excessive  computational  load  when  sub¬ 
sequent  measurements  are  incorporated.  Unlike  the  acceptance  test,  this  second 
stage  of  control  may  result  in  a  significant  modification  of  the  complete  solu¬ 
tion.  Hence  careful  consideration  should  be  given  to  this  approximation  in 
order to minimize the effect on filter performance. The design of such reduction
algorithms is the main subject of this Report.

In section 2 the Bayesian solution of the tracking problem is briefly
discussed and a set of requirements for a reduction algorithm is formulated.
Previous approaches to this problem are also discussed. The design of reduction
algorithms is reported in sections 3 to 5, and in section 6 the performance of
the algorithms is assessed by simulation for a particular tracking example.
This assessment also compares the performance of the proposed algorithms with
the popular Probabilistic Data Association Filter (PDAF) approach.


2  MIXTURE  DISTRIBUTIONS  AND  THE  REQUIREMENTS  OF  A  REDUCTION  ALGORITHM 

Bayes theory provides an ideal approach to the tracking problem. In this
approach the probability density function (pdf) p(x) of the state vector x
at time t_k is constructed using all available information. Here x is the
vector of parameters to be estimated. When further sensor measurements become
available at time t_{k+1}, this information is used to update the pdf using Bayes
theorem. In principle, an optimal estimate for any desired criterion, such as
minimum mean square error, may be obtained from the pdf p(x).

The  solution  of  tracking  problems  involving  measurement  uncertainty  leads 
to  mixture  distributions  for  the  required  state  vector.  A  mixture  distribution 
has  a  pdf  of  the  form: 


    p(x) = \sum_{i=1}^{N} \beta_i p_i(x) ,

where p_i(x) is a component pdf and \beta_i is a probability associated with the
ith component such that

    \beta_i > 0   and   \sum_{i=1}^{N} \beta_i = 1 .

For the tracking problem each component of the mixture corresponds to a possible
track, and \beta_i is the probability that the assumed measurement history for
track i is correct. If the equations of motion of the object to be tracked
are linear with Gaussian disturbances, and measurements originating from the
object are linearly related to the state vector but corrupted by Gaussian
measurement noise, then each of the mixture components is a Gaussian distribution
(see Refs 1, 2 and Appendix C, where this result is derived for a particular
tracking problem):

    p_i(x) = \mathcal{N}(x; \mu_i, P_i) ,

where \mu_i is the mean of the Gaussian distribution and P_i is the covariance
matrix.

In  this  case  p(x)  is  known  as  a  Gaussian  mixture  (see  Ref  6)  and  each  component 
may  be  thought  of  as  the  output  of  a  Kalman  tracking  filter. 
While uncertainty persists, the number of components in the mixture for
the  full  Bayesian  solution  will  grow  as  tracking  proceeds.  The  coarse  acceptance 
test  may  be  employed  to  cut  down  the  number  of  feasible  measurement  histories  to 
consider  (and  so  the  number  of  components)  by  rejecting  very  unlikely  measure¬ 
ments.  However  if  more  than  one  measurement  is  passed  by  the  acceptance  test 
at  each  time  step,  the  number  of  mixture  components  will  still  increase.  Since 
every  component  must  be  propagated  at  each  time  step,  to  implement  a  tracking 
filter  based  on  the  Bayesian  solution,  it  is  essential  to  control  the  growth  in 
the  number  of  components.  Here,  it  is  considered  that  the  control  should  be 
exercised  by  a  reduction  algorithm  which  fulfils  the  following  requirements: 

(i)  The  approximation  should  result  in  another  Gaussian  mixture.  This 
is  necessary  to  preserve  the  basic  tracking  filter  algorithm  which  is  a 
bank  of  Kalman  filters. 

(ii) The algorithm should allow the maximum number N_T of components
after approximation to be chosen as desired.

(iii) Whenever possible, reduction should be achieved without modifying
the 'structure' of the distribution beyond some acceptable limit. Conversely,
to avoid retaining unnecessary components, reduction should continue
until this limit is reached, so that the approximation may contain
fewer than N_T components.

(iv) Intuitively the approximation should preserve the mean and
covariance of the original mixture.

(v) The reduction algorithm should be computationally efficient, even
when the original mixture consists of a large number of components (for
example over 100), each with a different covariance matrix.

Of these requirements, number (iii) needs further comment. Ideally the
reduction algorithm should attempt to maintain some level of filter performance
within the limit of N_T components. A suitable performance measure would be
the probability of losing track. Unfortunately the relationship between this
performance measure and modification of the mixture distribution cannot be easily
determined (but see section 6). However, it is likely that the performance
penalty will be related to the extent to which the approximation modifies the
structure of the distribution. Hence requirement (iii) is written in terms of
distribution structure (see below).
A number of techniques for controlling the growth of the mixture distribution
have been reported. A popular and economical approach is the PDAF (Ref 3), in
which the Gaussian mixture is approximated by a single Gaussian component at
every time step. However, if well spaced components are present, the approximation
may destroy important structure in the distribution. Methods which allow for the
retention of more than one component include the N-scan memory filter of
Singer et al (Ref 1), and the direct approximation approaches of Alspach (Ref 4)
and Lainiotis and Park (Ref 5). None of these techniques ensures that the maximum
number of components in the approximation is always within a specified limit, and
Ref 1 does not use a direct measure of mixture structure. Ref 4 assumes that all
components have the same covariance and the method of Ref 5 would be very time
consuming.

In  the  following  sections  two  new  mixture  reduction  algorithms  which  meet 
all  of  the  above  requirements  are  proposed.  These  algorithms  operate  by  merging 
similar  components  together.  In  the  first  of  these  algorithms,  the  Joining 
Algorithm,  a  single  pair  of  the  'most  similar'  components  are  merged  at  every 
iteration.  In  the  second  algorithm,  the  Clustering  Algorithm,  groups  of  similar 
components  are  merged  at  each  iteration.  The  second  method  should  be  computation¬ 
ally  more  efficient,  but  with  the  former,  the  reduction  process  can  be  more  finely 
regulated  so  that  over-reduction  is  avoided.  Both  techniques  are  based  on  the 
same  measure  of  mixture  structure  modification  (requirement  (iii)),  which  is 
derived  from  a  decomposition  of  the  mixture  covariance  matrix. 

3  MIXTURE  STRUCTURE:  THE  COVARIANCE  MATRIX 

The covariance matrix P of any mixture distribution with N components
may be decomposed into two contributions, W and B (see Appendix A.1):

    P = W + B ,

where

    W = \sum_{i=1}^{N} \beta_i P_i ,

    B = \sum_{i=1}^{N} \beta_i (\mu_i - \mu)(\mu_i - \mu)^T ,

    \mu = \sum_{i=1}^{N} \beta_i \mu_i

is the mean of the distribution, and \mu_i, P_i and \beta_i are defined in section 2.

The matrix W may be interpreted as the contribution from the covariance 'within'
each component of the mixture, while B may be interpreted as the 'between'
component contribution due to the separation between the mixture components. B and
W are both symmetric matrices, W being positive definite and B being positive
semi-definite.
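The decomposition can be checked numerically. The following is a minimal sketch in Python with NumPy; the function name `decompose_covariance` and the three-component example mixture are illustrative assumptions, not taken from the Report.

```python
import numpy as np

def decompose_covariance(beta, mu, P):
    """Split the mixture covariance into within (W) and between (B) parts.

    beta: (N,) weights summing to 1; mu: (N, d) means; P: (N, d, d) covariances.
    """
    beta = np.asarray(beta, dtype=float)
    mu = np.asarray(mu, dtype=float)
    P = np.asarray(P, dtype=float)
    mean = beta @ mu                               # overall mean of the mixture
    W = np.einsum("i,ijk->jk", beta, P)            # sum_i beta_i P_i
    dev = mu - mean
    B = np.einsum("i,ij,ik->jk", beta, dev, dev)   # sum_i beta_i dev_i dev_i^T
    return mean, W, B

# Illustrative (made-up) three-component mixture in two dimensions.
beta = [0.5, 0.3, 0.2]
mu = [[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]]
P = [np.eye(2), 2.0 * np.eye(2), [[1.0, 0.3], [0.3, 0.5]]]
mean, W, B = decompose_covariance(beta, mu, P)
P_total = W + B                                    # covariance of the whole mixture
```

Here `P_total` agrees with the covariance obtained directly from the second moment of the mixture, and `B` has no negative eigenvalues.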

Suppose that the mixture distribution is approximated by merging several
components together. If I is the set of subscripts of components to be merged,
then in order to preserve the mean and covariance of the mixture, the probability
mass \beta', the mean \mu' and the covariance P' of the new component should be
chosen (see Appendix A.2) as:

    \beta' = \sum_{i \in I} \beta_i ,

    \mu' = (1/\beta') \sum_{i \in I} \beta_i \mu_i ,                              (1)

    P' = (1/\beta') \sum_{i \in I} \beta_i [ P_i + (\mu_i - \mu')(\mu_i - \mu')^T ] .


Although the overall covariance matrix P is unchanged, this merging of
components results in a loss of between component covariance B which is balanced
by an increase in W. More precisely, the difference L = W' - W, where W' is the
within component covariance after merging, is a positive semi-definite matrix
given by (see Appendix A.3):

    L = \sum_{i \in I} \beta_i (\mu_i - \mu')(\mu_i - \mu')^T .

This shift of covariance from B to W provides a useful measure of the
change in the structure of a mixture distribution when components are combined.
(A similar matrix decomposition has been used in Cluster Analysis, which is
concerned with the grouping of data points into natural clusters; see Hand, Ref 6.)
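A short numerical sketch of the merging step may help here. It assumes the moment-preserving merge described above (new mass equal to the summed masses, mass-weighted mean, and a covariance that absorbs the spread of the merged means); the function names and the example values are hypothetical.

```python
import numpy as np

def overall_moments(beta, mu, P):
    """Mean and covariance of the complete mixture (P = W + B)."""
    mean = beta @ mu
    dev = mu - mean
    cov = np.einsum("i,ijk->jk", beta, P) + np.einsum("i,ij,ik->jk", beta, dev, dev)
    return mean, cov

def merge(beta, mu, P, idx):
    """Merge the components listed in idx into one; return the new mixture and L."""
    idx = list(idx)
    keep = [i for i in range(len(beta)) if i not in idx]
    b = beta[idx]
    b_new = b.sum()                                  # merged probability mass
    mu_new = (b @ mu[idx]) / b_new                   # mass-weighted mean
    dev = mu[idx] - mu_new
    L = np.einsum("i,ij,ik->jk", b, dev, dev)        # covariance moved from B to W
    P_new = (np.einsum("i,ijk->jk", b, P[idx]) + L) / b_new
    beta_out = np.append(beta[keep], b_new)
    mu_out = np.vstack([mu[keep], mu_new])
    P_out = np.concatenate([P[keep], P_new[None]])
    return beta_out, mu_out, P_out, L

# Made-up example: merge the first two of three components.
beta = np.array([0.4, 0.35, 0.25])
mu = np.array([[0.0, 0.0], [1.0, 0.5], [6.0, -2.0]])
P = np.array([np.eye(2), 1.5 * np.eye(2), 0.5 * np.eye(2)])
beta2, mu2, P2, L = merge(beta, mu, P, [0, 1])
```

In this sketch the overall mean and covariance are the same before and after the merge, while the within component part grows by exactly `L`.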

4  THE JOINING ALGORITHM

Ideally the final partition of components into sets for merging should be
such that the increase in some cost function is minimised. However, to reduce
the mixture from N to M components, this could involve the evaluation of the
criterion for every possible partition to identify the minimum. Such a procedure
for a number of different values of M would be far too time consuming and so a
suboptimal approach has been adapted from the agglomerative methods of Cluster
Analysis (see Hand, Ref 6). In this approach, which we call the Joining Algorithm,
a pair of components are merged at every iteration of the algorithm. The
components for merging are chosen to minimize the increase in the chosen cost
function at each stage. Clearly there is no guarantee that the final partition
from such a procedure will achieve the smallest possible value of the cost function.

To implement the Joining Algorithm using a cost function based on an
increase in the within component covariance, we require a suitable scalar measure.
If components i and j are merged, the increase in W is given (see
Appendix A.4) by:

    L_{ij} = \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\mu_i - \mu_j)(\mu_i - \mu_j)^T .

One possible measure is the trace of L_{ij}, which is the squared Euclidean
distance between component means modified by the factor
\beta_i \beta_j / (\beta_i + \beta_j). However, this has the disadvantage that it
is dependent on the scaling of the elements of the state vector and so is problem
dependent. This difficulty is avoided by using the Mahalanobis distance
(see Ref 6) to give:

    d_{ij}^2 = \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\mu_i - \mu_j)^T P^{-1} (\mu_i - \mu_j) ,      (2)

where P is the covariance matrix of the complete mixture. This measure is
invariant under all non-singular linear transformations of the state vector
(see Appendix A.6). At each iteration of the Joining Algorithm, the two
components which are closest in the sense of the distance measure,
equation (2), are combined to form a new component defined by the relations,
equation (1).
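The invariance property is easy to illustrate numerically. The sketch below assumes the distance is the Mahalanobis distance between the two means, taken with respect to the overall mixture covariance P and scaled by the product of the two masses over their sum; the helper name `joining_distance` and all numbers are made up for illustration.

```python
import numpy as np

def joining_distance(beta_i, mu_i, beta_j, mu_j, P_inv):
    """Weighted Mahalanobis distance between two components (sketch)."""
    diff = mu_i - mu_j
    return beta_i * beta_j / (beta_i + beta_j) * (diff @ P_inv @ diff)

# Made-up values: two component means and an overall covariance P.
mu_i, mu_j = np.array([0.0, 1.0]), np.array([2.0, -1.0])
beta_i, beta_j = 0.3, 0.1
P = np.array([[2.0, 0.5], [0.5, 1.0]])
d2 = joining_distance(beta_i, mu_i, beta_j, mu_j, np.linalg.inv(P))

# Apply a non-singular linear change of state variables x -> A x.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
P_t = A @ P @ A.T
d2_t = joining_distance(beta_i, A @ mu_i, beta_j, A @ mu_j, np.linalg.inv(P_t))

# The Euclidean alternative (trace of L_ij) is not invariant under the same map.
scale = beta_i * beta_j / (beta_i + beta_j)
tr = scale * np.sum((mu_i - mu_j) ** 2)
tr_t = scale * np.sum((A @ (mu_i - mu_j)) ** 2)
```

`d2` and `d2_t` coincide up to rounding, whereas `tr` and `tr_t` differ, which is the scaling dependence the text refers to.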

The minimum value of the distance measure at each iteration is an indicator
of the change in distribution structure resulting from the merging of the two
closest components. It is shown in Appendix B that this minimum distance
increases monotonically as reduction proceeds and so each merging operation
increases this measure of structural modification. Thus if a threshold T
defining the maximum acceptable modification to the distribution is specified,
approximation should proceed until the minimum distance exceeds this threshold.
In choosing a value for the threshold T, it is useful to note (see Appendix
A.7) that the distance d_{ij}^2 is bounded:

    d_{ij}^2 < dim(x) .

Simulation studies indicate that a value of

    T = 0.001 dim(x)

retains sufficient components to give, on visual inspection, a good approximation
to the mixture. At each iteration, the algorithm determines the number N_R of
remaining components, excluding the set of smallest components with total
probability mass (ie the sum of their \beta weights) less than B_T. If d_{ij}^2
exceeds T before N_R has been reduced below the specified maximum N_T, then
approximation continues beyond the acceptable limit of modification. The purpose
of B_T, which has been set to 0.01, is to avoid wasting effort on grouping
insignificant components.


To implement the Joining Algorithm, a matrix (d_{ij}^2) containing the distance
between every pair of components in the original mixture is evaluated using
equation (2). Note the matrix (d_{ij}^2) is symmetric and d_{ii}^2 = 0, so that only
the upper triangular part of the matrix need be evaluated. At each iteration
the smallest element of the matrix (for i < j) is found and the corresponding
pair of components are merged using the formulae (1). Then row j and column j
of the distance matrix are deleted, the new component is written into storage
location i, and row i and column i of the matrix are re-evaluated using (2).
Again all processing is confined to the upper triangle of the matrix. Note
that since the merging of components preserves the covariance matrix P, only
one matrix inversion suffices for all distance evaluations. Algorithm iterations
continue until the stopping criteria are satisfied (see the flow diagram of the
algorithm given in Fig 1). The storage requirement for the algorithm is O(N^2),
the number of distance evaluations is O(N^2) and the number of comparisons is
O(N^3), where N is the number of components in the original mixture.
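The procedure can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation of Fig 1: it recomputes all pairwise distances at every iteration instead of updating an upper-triangular distance matrix, and it ignores the small-mass allowance B_T, so that N_R is simply the current number of components. The name `joining_reduce` and the example mixture are hypothetical.

```python
import numpy as np

def joining_reduce(beta, mu, P, n_max, T):
    """Greedy pairwise merging until at most n_max components remain
    and the closest pair is further apart than the threshold T."""
    beta = [float(b) for b in beta]
    mu = [np.asarray(m, dtype=float) for m in mu]
    P = [np.asarray(c, dtype=float) for c in P]
    # Overall covariance is preserved by every merge, so invert it once.
    mean = sum(b * m for b, m in zip(beta, mu))
    P_all = sum(b * (c + np.outer(m - mean, m - mean))
                for b, m, c in zip(beta, mu, P))
    P_inv = np.linalg.inv(P_all)
    while len(beta) > 1:
        best = None
        for i in range(len(beta)):
            for j in range(i + 1, len(beta)):
                diff = mu[i] - mu[j]
                d2 = beta[i] * beta[j] / (beta[i] + beta[j]) * (diff @ P_inv @ diff)
                if best is None or d2 < best[0]:
                    best = (d2, i, j)
        d2, i, j = best
        if d2 > T and len(beta) <= n_max:
            break                      # acceptable size reached, stop merging
        b = beta[i] + beta[j]
        m = (beta[i] * mu[i] + beta[j] * mu[j]) / b
        c = (beta[i] * (P[i] + np.outer(mu[i] - m, mu[i] - m))
             + beta[j] * (P[j] + np.outer(mu[j] - m, mu[j] - m))) / b
        beta[i], mu[i], P[i] = b, m, c  # new component stored in location i
        del beta[j], mu[j], P[j]        # component j removed (j > i)
    return np.array(beta), np.array(mu), np.array(P)

# Made-up mixture: two tight pairs of components, far apart from each other.
beta0 = [0.25, 0.25, 0.25, 0.25]
mu0 = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
P0 = [np.eye(2)] * 4
beta1, mu1, P1 = joining_reduce(beta0, mu0, P0, n_max=2, T=0.001 * 2)
```

With these values each tight pair is merged and the two resulting components are left separate, since their distance is well above the threshold.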


Figs 2 to 4 show an example of mixture reduction with the Joining Algorithm
for a two dimensional distribution. Note that for N_T = 10 the approximation
appears to be very good, although for N_T = 4 several important components have
been combined.


5  THE  CLUSTERING  ALGORITHM 

The second algorithm is based on the proposition that the mixture components
with the largest \beta weightings carry the most important information.
Thus starting with the largest component, this algorithm gathers in all
surrounding components that are close to the principal component. Subsequently the
largest component of the remainder is selected and the process is repeated until
all the components have been clustered. This is called the Clustering Algorithm.

The distance measure chosen to represent the closeness of component i to
the cluster centre is defined by

    d_i^2 = \frac{\beta_i \beta_c}{\beta_i + \beta_c} (\mu_i - \mu_c)^T P_c^{-1} (\mu_i - \mu_c) ,

where \beta_c, \mu_c and P_c are the parameters of the principal component, and
\beta_i and \mu_i are the probability mass and mean of the ith component. This is
the same as the distance measure d_{ij}^2 of the previous section, except the
distance is normalized to the covariance of the cluster centre rather than the
complete mixture. Any component i for which d_i^2 < T_1 is selected as a cluster
member. The threshold T_1 defines the acceptable modification to the distribution.

In choosing T_1, it is helpful to first consider the measure D_i^2
defined by:

    D_i^2 = (\mu_i - \mu_c)^T P_c^{-1} (\mu_i - \mu_c) .

If the criterion for clustering a component i were D_i^2 < T', then any
component i whose mean were to fall within the hyperellipsoid defined by T'
would be clustered. This hyperellipsoid is a contour of constant probability
density of the principal component, and the proportion of probability mass
enclosed is a measure of the selectivity of the clustering operation. If T'
were chosen so that only a small proportion, say 1%, of the probability mass of
the cluster centre were enclosed, then the structure of the distribution should
be little altered by clustering. However, D_i^2 is independent of the probability
mass (\beta_i) of the component, and intuitively, merging a large component would
have a greater effect on the mixture than merging a small component. The
modifying factor \beta_i \beta_c / (\beta_i + \beta_c) biases this distance so that
small components are more easily clustered while large components retain their
individuality. It is suggested that the threshold for d_i^2 should be chosen so
that small components with \beta weights less than 0.05 are more readily
clustered while components with \beta weights exceeding 0.05 are clustered less
readily. Fig 5 shows that the contour

    \frac{\beta_i \beta_c}{\beta_i + \beta_c} = 0.05

is close to the line \beta_i = 0.05 inside the region of interest, except when
\beta_i is nearly equal to \beta_c. Thus it is suggested that to give a good mixture
approximation, the threshold for d_i^2 should be set to

    T_1 = 0.05 T' ,

where T' defines the hyperellipsoid containing only 1% of the probability
mass. (T' can be found from tables of \chi^2.)
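As a worked illustration of the table look-up, the sketch below computes the threshold numerically. It assumes that the relevant table entry is the chi-square quantile with dim(x) degrees of freedom (the distribution of the Mahalanobis distance of a Gaussian vector from its own mean), and it uses the closed-form chi-square distribution function that exists for an even number of degrees of freedom. The function names are hypothetical, and the 1% and 0.05 defaults are the values as read from the text above.

```python
import math

def chi2_cdf_even(x, dof):
    """Chi-square distribution function for an even number of degrees of freedom."""
    if dof <= 0 or dof % 2:
        raise ValueError("this sketch only handles even, positive dof")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j               # (x/2)^j / j!
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, dof):
    """Invert the distribution function by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clustering_threshold(dim, enclosed=0.01, factor=0.05):
    """Threshold for the weighted distance: factor times the hyperellipsoid size."""
    return factor * chi2_quantile_even(enclosed, dim)

T1_dim2 = clustering_threshold(2)      # two-dimensional example
T1_dim4 = clustering_threshold(4)      # four-dimensional state vector
```

For two dimensions the quantile has the closed form -2 ln(1 - p), which gives a quick check on the bisection.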

Each cluster of components (some clusters may consist of a single component)
is approximated by a single Gaussian defined by equation (1). Clustering proceeds
until the probability mass of the unclustered components is less than B_T. As
for the Joining Algorithm, the purpose of B_T, which is set to 0.01, is to
avoid wasting effort on clustering insignificant components. If the number of
clusters is less than or equal to N_T, the unclustered components are deleted
and approximation is complete; otherwise further reduction is necessary. This
is achieved by repeating the clustering procedure on the first approximation,
but with the clustering threshold incremented by \Delta T. This clustering operation
is iterated until the necessary reduction has been effected. The choice of the
increment \Delta T is a compromise between the number of iterations required and the
possibility of clustering more components than necessary. In this study, the
value of \Delta T is fixed:

    \Delta T = 0.05 \Delta T' ,

where T' + \Delta T' defines the hyperellipsoid which contains 6% of the probability
mass of the principal component. (Simulation work has shown this to be a
reasonable compromise.) However, an override is provided which may increase the
clustering threshold further to ensure that at least one component is clustered on each
iteration. A flow diagram of the algorithm is given in Fig 6.
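A compact sketch of the clustering procedure follows. It is an illustration under several assumptions rather than the algorithm of Fig 6: the small unclustered remainder is dropped at the end of every pass and the weights are then renormalised (the text does not say how the deleted mass is treated), and the override that forces at least one merge per iteration is omitted, so reduction relies on the growing threshold alone. The names `cluster_pass` and `clustering_reduce` and the example values are hypothetical.

```python
import numpy as np

def cluster_pass(beta, mu, P, T, b_min=0.01):
    """One pass: repeatedly take the largest free component as principal
    component and absorb every free component with weighted distance < T."""
    beta = np.asarray(beta, dtype=float)
    mu = np.asarray(mu, dtype=float)
    P = np.asarray(P, dtype=float)
    free = np.ones(len(beta), dtype=bool)
    out_b, out_m, out_P = [], [], []
    while beta[free].sum() >= b_min:
        c = np.flatnonzero(free)[np.argmax(beta[free])]      # principal component
        diff = mu - mu[c]
        D2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(P[c]), diff)
        d2 = beta * beta[c] / (beta + beta[c]) * D2
        members = free & (d2 < T)
        members[c] = True
        b = beta[members]
        b_new = b.sum()
        m_new = (b @ mu[members]) / b_new
        dev = mu[members] - m_new
        P_new = (np.einsum("i,ijk->jk", b, P[members])
                 + np.einsum("i,ij,ik->jk", b, dev, dev)) / b_new
        out_b.append(b_new); out_m.append(m_new); out_P.append(P_new)
        free &= ~members
    out_b = np.array(out_b)
    return out_b / out_b.sum(), np.array(out_m), np.array(out_P)  # assumed renormalisation

def clustering_reduce(beta, mu, P, n_max, T1, dT, b_min=0.01):
    """Repeat the pass with an incremented threshold until n_max is met."""
    T = T1
    while True:
        beta, mu, P = cluster_pass(beta, mu, P, T, b_min)
        if len(beta) <= n_max:
            return beta, mu, P
        T += dT

# Made-up mixture: two tight pairs of components, far apart from each other.
beta0 = [0.25, 0.25, 0.25, 0.25]
mu0 = [[0.0, 0.0], [0.05, 0.0], [5.0, 5.0], [5.05, 5.0]]
P0 = [np.eye(2)] * 4
two = clustering_reduce(beta0, mu0, P0, n_max=2, T1=0.001, dT=1.0)
one = clustering_reduce(beta0, mu0, P0, n_max=1, T1=0.001, dT=1.0)
```

In the example the first call clusters each tight pair in a single pass, while the second needs several threshold increments before the two clusters are combined.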


The computational cost of this algorithm depends on how many iterations are
required to adequately reduce the mixture. It is most efficient when all
approximation is accomplished within a single iteration, which requires between N and
NM distance evaluations and comparisons, and M matrix inversions to reduce the
mixture from N to M components. In the worst case when only one component
is clustered on each iteration, the number of distance evaluations and comparisons
is O(N^3) and the number of matrix inversions is O(N^2). The algorithm
requires minimal extra storage above that needed to hold the mixture components.

Figs  7  and  8  show  the  result  of  applying  the  Clustering  Algorithm  to  the 
mixture shown in Fig 2. Note that for N_T = 10, the approximation is very
similar to that produced by the Joining Algorithm (Fig 3). However for N_T = 4
there  are  clear  differences  between  the  approximations  from  the  two  algorithms 
(Figs  4  and  8) . 

6  COMPARISON OF FILTER PERFORMANCE AND THE EFFECT OF VARYING N_T
6.1  The tracking problem

The main object of this simulation is to compare the performance of tracking
filters using the Joining Algorithm, the Clustering Algorithm and the PDAF
approximation. The performance of these filters has been assessed for the problem of
tracking an object moving in a plane, using measurements produced by a sensor
(the same problem is considered in Ref 7). The simulation is in three parts:

(i) the generation of the object trajectory and the sensor measurements;

(ii)  the  implementation  of  the  tracking  filters; 

(iii)  the  assessment  of  the  filters'  performance. 

Object trajectories have been generated from an \alpha-\beta model (Ref 8). This model
has been widely used in tracking problems as it is simple, while providing an
adequate trajectory representation for many practical cases. The trajectory
described by the model is a variation about a constant velocity course, whose
magnitude and direction are defined by initial conditions. The deviation from
this mean course is controlled by the variance q of the model driving noise.

The \alpha-\beta model is defined by the following equation:

    x_{k+1} = \Phi x_k + \Gamma w_k ,                                             (3)

where the state vector x_k represents the position and velocity of the object
at time k \Delta t:

    x_k = (x, \dot{x}, y, \dot{y})^T ,

\Delta t is the time step between measurements, and

    \Phi = [ 1  \Delta t  0  0 ;  0  1  0  0 ;  0  0  1  \Delta t ;  0  0  0  1 ] ,

    \Gamma = [ \Delta t^2/2  0 ;  \Delta t  0 ;  0  \Delta t^2/2 ;  0  \Delta t ] .

w_k is a 2 x 1 vector from a Gaussian random sequence with zero mean and constant
covariance

    E[ w_k w_k^T ] = q I .

Thus, to generate a trajectory {x_k}, Gaussian random numbers of variance q
were fed through the recurrence relation (3), starting from some initial
condition x_0.


At each time step a set of Cartesian position measurements has been
generated to simulate sensor measurements. It is assumed that the probability
P_D of detecting the object is unity, so exactly one measurement in each set
originates from the object. This is called the true measurement and it is a
Gaussian perturbation about the position of the object. It is generated from
the state vector using the equation

    z_k = H x_k + v_k ,    H = [ 1  0  0  0 ;  0  0  1  0 ] ,                     (4)

where v_k is a 2 x 1 vector of Gaussian measurement noise with zero mean and
constant covariance

    E[ v_k v_k^T ] = r I .
The other measurements are independent of the object and are called false
measurements. They are uniformly distributed over the sensor surveillance
region, with density \rho per unit area. At each time step, the surveillance
region of the sensor is arranged to be sufficiently extensive to include the
object position and the acceptance regions of the filters, while track is
maintained. False measurements were simulated by generating A \rho pairs of
uniformly distributed random numbers with appropriate scaling, A being the
area of the surveillance region.
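The measurement generation can be sketched as follows. Several ingredients are assumptions made for illustration only: the constant-velocity transition matrix and the piecewise-constant acceleration noise matrix commonly used with this kind of model, a position-only measurement matrix with noise variance r on each coordinate, and a square surveillance region centred on the object position. The function names and parameter values are hypothetical.

```python
import numpy as np

def make_model(dt):
    """Assumed constant-velocity model for the state (x, x_dot, y, y_dot)."""
    Phi = np.array([[1, dt, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 1, dt],
                    [0, 0, 0, 1]], dtype=float)
    Gamma = np.array([[dt**2 / 2, 0],
                      [dt, 0],
                      [0, dt**2 / 2],
                      [0, dt]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 0, 1, 0]], dtype=float)
    return Phi, Gamma, H

def simulate_step(rng, x, dt, q, r, rho, half_width):
    """Advance the object one step and generate one set of measurements."""
    Phi, Gamma, H = make_model(dt)
    x_next = Phi @ x + Gamma @ rng.normal(0.0, np.sqrt(q), size=2)
    centre = H @ x_next                                    # true object position
    z_true = centre + rng.normal(0.0, np.sqrt(r), size=2)  # true measurement
    area = (2.0 * half_width) ** 2                         # square region (assumed)
    n_false = int(round(area * rho))                       # A * rho false measurements
    z_false = centre + rng.uniform(-half_width, half_width, size=(n_false, 2))
    return x_next, z_true, z_false

rng = np.random.default_rng(0)
x0 = np.array([0.0, 10.0, 0.0, 0.0])
# q = 0 switches off the driving noise, giving a pure constant velocity step.
x1, z_true, z_false = simulate_step(rng, x0, dt=1.0, q=0.0, r=1.0, rho=0.12,
                                    half_width=5.0)
```

Calling `simulate_step` repeatedly, feeding each returned state back in, yields a trajectory together with its true and false measurements.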

At  each  time  step,  every  simulated  measurement  is  passed  to  the  tracking 
filters  which  attempt  to  estimate  the  current  value  of  the  object's  state  vector. 
The  following  information  is  available  to  the  filters: 

(i) the value of the initial state vector x_0;

(ii) the model of the object motion, equation (3);

(iii) the relationship between the state vector and the true measurement,
equation (4);

(iv) the statistics of the false measurements, the true measurement
noise and the model driving noise, including the values of \rho, r and q;

(v) the detection probability of the sensor.

The tracking filters do not know:

(a) the values of the state vector x_k, or the noise vectors w_k
and v_k at each time step (k > 0);

(b)  the  identity  of  the  true  measurement. 

As indicated, three filters have been implemented: the Joining Algorithm filter
(JAF), the Clustering Algorithm filter (CAF) and the PDAF. As already discussed,
each of these filters is based on the Bayesian solution of the above problem
(see Appendix C) and each uses the coarse acceptance test described in Appendix D.
The only difference between the filters is the mixture reduction algorithm
employed. However, for the PDAF, which approximates the mixture by a single
Gaussian component, a full propagation of all components is unnecessary and a
very efficient filter algorithm may be used.

The performance of the filters was assessed by measuring how long each of
the three filters was able to maintain track on the object, ie the track lifetime.
Each filter was allowed to continue tracking the object until track was lost. A
track was deemed to be lost if either of the following criteria were satisfied:

(a) The true measurement is rejected by the acceptance test for five
consecutive time steps.

(b) |\hat{x}_k - x_k| > 10 \sigma_{xk} or |\hat{y}_k - y_k| > 10 \sigma_{yk} for five
consecutive time steps,

where \hat{x}_k, \hat{y}_k is the filter estimate (the mean of the posterior
distribution) of the object position at time step k,
x_k, y_k is the actual object position at time step k, and
\sigma_{xk} and \sigma_{yk} are the standard deviations of the position
estimates of the equivalent Kalman filter (ie the optimal filter
for the same problem but with \rho = 0).
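The two track loss criteria amount to counting consecutive violations. A minimal sketch, assuming per-step records of whether the true measurement was rejected, the position errors and the reference standard deviations are available; the function name and argument layout are hypothetical, and criterion (b) is read here as an error on either coordinate exceeding its limit.

```python
def first_track_loss(rejected, err_x, err_y, sigma_x, sigma_y, run=5, factor=10.0):
    """Return the index of the time step at which track loss is declared,
    or None if neither criterion is met over the supplied history."""
    run_a = 0   # consecutive steps with the true measurement rejected
    run_b = 0   # consecutive steps with an excessive position error
    for step, (rej, ex, ey, sx, sy) in enumerate(
            zip(rejected, err_x, err_y, sigma_x, sigma_y)):
        run_a = run_a + 1 if rej else 0
        too_far = abs(ex) > factor * sx or abs(ey) > factor * sy
        run_b = run_b + 1 if too_far else 0
        if run_a >= run or run_b >= run:
            return step
    return None

# Made-up histories of ten time steps.
n = 10
ones = [1.0] * n
zeros = [0.0] * n
loss_a = first_track_loss([False] * 3 + [True] * 5 + [False] * 2, zeros, zeros, ones, ones)
loss_b = first_track_loss([False] * n, [0.0, 0.0] + [11.0] * 5 + [0.0] * 3, zeros, ones, ones)
no_loss = first_track_loss([True, False] * 5, zeros, zeros, ones, ones)
```

The track lifetime of a run is then the returned index, in units of time steps.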

6.2  Choice  of  problem  parameters 

To analyse this tracking problem it is convenient to normalize the
variables, so that the unit of time is \Delta t and the unit of distance is \sqrt{r}.
Then the non-dimensional form of the state vector is

    ( x/\sqrt{r} ,  \dot{x} \Delta t/\sqrt{r} ,  y/\sqrt{r} ,  \dot{y} \Delta t/\sqrt{r} )^T .

If the target model and measurement equations are written in this form, it can
be shown that the statistics of the problem are completely described by three
non-dimensional parameters:

(i) q \Delta t^4 / r, the ratio which determines the values of the filter gains
for the standard \alpha-\beta filter, ie in the absence of false measurements.
As this parameter increases, the \alpha-\beta filter becomes more responsive to
position measurements.

(ii) \rho r, the expected number of false measurements falling within a
square whose side is one standard deviation of the measurement error.

(iii) P_D, the detection probability, assumed to be unity.

Since the initial state vector is assumed to be known perfectly, the filter
performance in normalized co-ordinates should only depend on these three parameters.
(This is because the problem may be written as the estimation of the deviation
about the nominal constant velocity course defined by the initial state vector.)
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All  simulation  results  reported  here  are  for  a  single  point  in  this 


parameter  space: 


4 

qAt 

r 


0.012 


These values have been chosen to illustrate the possible improvement in tracking
performance  of  the  new  reduction  algorithms  over  the  PDAF.  However,  it  is 
believed  that  the  region  of  the  parameter  space  where  there  is  a  significant 
improvement  is  extensive,  and  a  full  investigation  of  filter  performance  over 
the  space  will  be  reported  separately.  Another  factor  in  the  choice  of  the  above 
parameter  values  was  a  requirement  for  modest  track  lifetimes,  to  avoid  excessive 
computation  costs. 

One  hundred  object  trajectories  with  associated  measurements  were  generated 
so  that  the  mean  track  lifetime  and  the  distribution  of  lifetimes  could  be 
estimated.  The  initial  object  position  was  taken  as  the  origin  and  the  initial 
speed was $10\sqrt{r}/\Delta t$. The initial heading of the object was chosen randomly for
each  trajectory.  For  the  chosen  problem  parameters,  the  equivalent  Kalman  filter 
rapidly  reaches  steady  state  conditions,  and  the  standard  deviation  of  the  posi¬ 
tion  error  on  one  of  the  co-ordinates  approaches  within  135  of  its  final  steady 
state value after only four time steps. Also if the track of the object is estimated by assuming a constant velocity course and extrapolating from the initial perfect data (ie ignoring all measurements), the number $k$ of time steps for the standard deviation of the error on one of the co-ordinates, say $x$, to exceed $10\,\sigma_{x_k}$ (see track loss criterion (b)) is 7. These two figures provide a useful timescale when considering the results of the simulation.

6.3  Results 

6.3.1  Average  number  of  time  steps  to  track  loss 

Fig 9 shows the average number $N_{AVE}$ of time steps until track loss as a function of $N_T$, for filters using the Clustering Algorithm and the Joining Algorithm with thresholds set to the values given in sections 4 and 5. $N_T = 1$ corresponds to the special case of the PDAF, and clearly the filters which retain more than one mixture component perform better than the PDAF. The Joining Algorithm filter gives slightly larger values of $N_{AVE}$ than the Clustering Algorithm, possibly due to the settings of the algorithm thresholds.

Also shown in Fig 9 is the filter performance for the JAF with $T = 0$, ie with the acceptable modification check switched off. Note that the original setting of $T$ for the JAF does not significantly degrade the filter's performance, and that the performance for all three cases shown in Fig 9 is similar.

For $N_T < 10$, $N_{AVE}$ rises approximately linearly with $N_T$, while for $N_T > 10$, $N_{AVE}$ is nearly constant. For the JAF with $T = 0$ and $N_T$ very large, the mixture is not subject to approximation, and so this constant level is the optimal value of $N_{AVE}$.

Fig 10 shows the average number of mixture components before and after reduction for the three cases of Fig 9. Comparing Fig 10a&b with Fig 10c, the effect of the acceptable modification check, defined by the algorithm thresholds, in regulating the number of components for the large values of $N_T$ is obvious. For small values of $N_T$, the approximation for all three cases is principally controlled by $N_T$ itself. For this example, the thresholds become the main regulators of the approximation at about $N_T = 10$, so the acceptable modification check appears to select the minimum number of components for near optimal performance. Clearly this cannot be guaranteed for other tracking problems, but since the thresholds were not specially tuned for this simulation, the performance with other problems may not be far from optimal.


For an interpretation of these results, it is useful to view the generation of mixture components as the filter's way of keeping options open when the choice of the true measurement is uncertain. First consider the optimal case when the mixture is not approximated. In this case, track will still be lost (according to the criteria of section 6.1) when components corresponding to false measurements are given a high probability weighting ($\beta$) through a chance event. For instance, a manoeuvre by the object under track might coincide with the production of a false measurement on the original heading, causing the filter to give a high weighting to the false measurement. Several such occurrences could lead the mean position estimate away from the actual object position, so satisfying the track loss criteria. (Note that the average track lifetime depends on the track loss criteria.) Clearly the probability of such occurrences is likely to increase with the 'difficulty' of the tracking problem, for example if the density $\rho$ of false measurements were to be increased. Thus, in agreement with intuition, we expect the average track survival time $N_{AVE}$ to decrease with increasing problem difficulty for given (sensible) track loss criteria.

Now consider the effect of the reduction algorithms. Even without approximation, at any time step, many components of the distribution are almost identical so that the complete mixture distribution appears to consist of only a limited number of clearly distinct, significant components. For example, the distribution shown in Fig 2 comprises 37 components, although many of these are almost identical. The reduction algorithms attempt to combine the most similar components and if this can be accomplished without merging the significant distinct components (see Figs 2 and 3) little degradation in tracking performance is to be expected. However if the number of components retained falls below some critical level, these key components will be merged and tracking performance will deteriorate progressively as the permitted number of components is reduced. In the current example $N_T = 10$ appears to be the critical level at which tracking performance begins to degrade.

6.3.2  Distribution  of  number  of  time  steps  to  track  loss 

In the previous section, the average track lifetime was discussed. In this section we consider the distribution of track lifetimes about this mean. To illustrate the distribution and to compare the performance of the CAF and JAF for individual replications, the track maintenance times have been plotted in Figs 11 to 13 for $N_T = 2$, 4 and 30 respectively. In these diagrams each point corresponds to a single replication, and the X and Y co-ordinates of the point are the time steps at which the JAF and CAF (with original threshold settings) lost track respectively. So points falling on the X = Y line indicate that both filters lost track coincidently. For large values of $N_T$ (eg $N_T = 30$, Fig 13), the performance of the two filters is remarkably similar for the majority of replications. The few replications biasing $N_{AVE}$ in favour of the JAF are obvious. For small values of $N_T$ (eg $N_T = 2$, Fig 11), the points are scattered further from X = Y, although $N_{AVE}$ is almost identical for the two filters. These results bear out the observation that the mixture approximations produced by the two reduction algorithms are usually very similar for large $N_T$, while for small $N_T$ there are often clear differences.

Figs 14 and 15 show histograms of the data points from Figs 11 and 13, ie for the track lifetimes for the JAF and CAF with $N_T = 2$ and $N_T = 30$. It can be seen that those track lifetimes exceeding 20 time steps can be well fitted by an exponential distribution of the form:

$$p(t) = \begin{cases} \dfrac{1}{a}\exp\left(-\dfrac{t - t_{min}}{a}\right) & t \ge t_{min} \\ 0 & t < t_{min} \end{cases}$$

where $(t_{min} + a)$ is the average lifetime of tracks which survive for at least $t_{min} = 20$ time steps. This is confirmed by a $\chi^2$ test: the exponential hypothesis is only once rejected at the 5% level of significance for any of the 24 sets of replications. This exponential distribution indicates that after 20 time steps, the probability of losing track is independent of track lifetime, ie after an initial transient the filters reach steady state conditions. The value $t_{min} = 20$ was chosen by examining the transient behaviour of the equivalent Kalman filter (see last paragraph of section 6.2) and by inspection of the simulation results. The distribution parameter $a$ may be interpreted as the average number of time steps that a track will survive in steady state conditions. Estimates of $a$ are shown in Fig 16. These values are slightly greater than $N_{AVE} - 20$, as tracks surviving for less than 20 time steps are excluded.
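As an illustration of how the parameter $a$ might be estimated from a set of simulated track lifetimes, the sketch below uses the maximum likelihood estimate for a shifted exponential with known origin, which is simply the mean excess over $t_{min}$. The function name and the treatment of lifetimes as continuous values are assumptions made for this example; the estimator actually used for Fig 16 is not stated here.

```python
import numpy as np

def fit_steady_state_lifetime(lifetimes, t_min=20):
    """Estimate a in p(t) = exp(-(t - t_min) / a) / a for t >= t_min.

    Tracks lost before t_min are excluded, as in the text. The estimate is
    the mean number of time steps survived beyond t_min.
    """
    t = np.asarray(lifetimes, dtype=float)
    survivors = t[t >= t_min]
    if survivors.size == 0:
        raise ValueError("no track survived to t_min")
    return float(np.mean(survivors - t_min))

# Hypothetical lifetimes (in time steps) from five replications.
example_lifetimes = [5, 10, 20, 30, 40]
a_hat = fit_steady_state_lifetime(example_lifetimes)
```

For these made-up lifetimes the estimate exceeds the overall mean lifetime minus 20, because the two short-lived tracks are excluded.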

6.3.3 Computation time

Fig 17 shows the average cpu time $T_{AVE}$ for the filters to perform a single time step. The time scale is normalized to the average cpu time for a single PDAF time step which, for the data simulated here, was 1.12 ms on a Cray 1S computer. The computational effort is divided between the propagation of mixture components or tracks (see Appendix C) and mixture reduction. For the two filters with the original threshold settings (Fig 17a&b), $T_{AVE}$ falls rapidly to nearly constant values for $N_T > 10$. Also for low values of $N_T$ most time is spent reducing the mixture, and as $N_T$ increases more time is required for track propagation while the mixture reduction time decreases. This is explained by Fig 10: the initial high values of $T_{AVE}$ are due to time spent reducing large mixtures which result from inadequate approximations at values of $N_T < 6$. Except for the case $N_T = 6$, the JAF was more time-consuming than the CAF, usually by about 50%, and as expected, the execution times for the filters were in all cases considerably greater than the PDAF. However for $N_T > 10$, the eight-fold increase in execution time for the CAF may well be an acceptable price for the performance improvement offered by this filter.

The time taken by the JAF with $T = 0$ is shown in Fig 17c. This clearly shows the value of the acceptable modification check in the reduction algorithms: for the very small improvement for $N_T > 10$ over the filter with the original threshold settings, there is a large increase in processing time. This extra time is required for the propagation and reduction of the extra tracks generated when the full $N_T$ components are retained for $N_T > 10$ (see Fig 10).

7 CONCLUSIONS

(1) Two new mixture reduction algorithms for uncertain tracking have been developed. These algorithms have been applied to the optimal filter for tracking an object in uniformly distributed false measurements to produce two practical tracking filters: the Joining Algorithm filter (JAF) and the Clustering Algorithm filter (CAF).

(2) For the chosen simulation example (an object moving according to an $\alpha$-$\beta$ model) these filters give a substantial performance improvement over the popular PDAF: average track survival time (from an initially perfect track) may be increased by a factor of 8.

(3)  However  the  computation  times  for  these  more  complex  filters  are  also 
greater  than  the  PDAF:  a  factor  of  8  for  the  CAF  and  a  factor  of  13  for  the  JAF. 
Also  computer  memory  requirements  are  increased,  particularly  for  the  JAF. 

(4) The simulation indicates that the minimum computation time and near optimum performance are obtained when satisfactory mixture approximation (defined by algorithm thresholds) is achieved within the maximum number of components allowed. If the permitted number of mixture components is reduced below some critical level, tracking performance will deteriorate.

(5) Under these conditions, the track survival times for the two filters were identical on at least 85% of the replications. This suggests that filter performance is not highly sensitive to the method of mixture reduction, provided that the most important mixture components are retained.

(6) With continuing improvements in computing power, tracking filters which retain more than one mixture component, such as the JAF and CAF, are practical alternatives to the PDAF for problems involving measurement association ambiguity. Further work is necessary to assess the performance and computer requirements of such filters for a wider range of problems.

Appendix A

PROOF OF RESULTS OF SECTIONS 3 AND 4


A.1 Structure of mixture covariance

Consider any mixture distribution with pdf

$$p(x) = \sum_{i=1}^{N} \beta_i\, p_i(x)$$

and let the mean of the $i$th component be $\mu_i$ and the covariance of the $i$th component be $P_i$, $i = 1, \ldots, N$.

The mean of the mixture is defined by

$$\bar{x} = \int x\, p(x)\, dx = \sum_{i=1}^{N} \beta_i \int x\, p_i(x)\, dx = \sum_{i=1}^{N} \beta_i \mu_i .$$

The covariance matrix of the mixture is defined by

$$P = \int (x - \bar{x})(x - \bar{x})^T p(x)\, dx = \int x x^T p(x)\, dx - \bar{x}\bar{x}^T = \sum_{i=1}^{N} \beta_i \int x x^T p_i(x)\, dx - \bar{x}\bar{x}^T .$$

But

$$P_i = \int x x^T p_i(x)\, dx - \mu_i \mu_i^T ,$$

so

$$P = \sum_{i=1}^{N} \beta_i \left( P_i + \mu_i \mu_i^T \right) - \bar{x}\bar{x}^T \qquad \text{(A-1)}$$

$$P = W + B$$

where

$$W = \sum_{i=1}^{N} \beta_i P_i ,$$

which depends on the spread of each individual component of the mixture, and

$$B = \sum_{i=1}^{N} \beta_i \mu_i \mu_i^T - \bar{x}\bar{x}^T = \sum_{i=1}^{N} \beta_i (\mu_i - \bar{x})(\mu_i - \bar{x})^T ,$$

which depends on the separation between components.
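The decomposition $P = W + B$ can be checked numerically. The following NumPy sketch is illustrative only: the function name `mixture_moments` and the three-component mixture are invented for the example.

```python
import numpy as np

def mixture_moments(beta, mu, P):
    """Return (x_bar, W, B) for a mixture.

    beta : (N,) component weights summing to one
    mu   : (N, n) component means
    P    : (N, n, n) component covariances
    """
    beta, mu, P = (np.asarray(a, dtype=float) for a in (beta, mu, P))
    x_bar = beta @ mu                               # mixture mean
    W = np.einsum("i,ijk->jk", beta, P)             # spread within components
    dev = mu - x_bar
    B = np.einsum("i,ij,ik->jk", beta, dev, dev)    # separation between components
    return x_bar, W, B

# Hypothetical three-component mixture in two dimensions.
beta = np.array([0.5, 0.3, 0.2])
mu = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[2.0, 0.0], [0.0, 2.0]],
              [[1.0, 0.5], [0.5, 1.0]]])

x_bar, W, B = mixture_moments(beta, mu, P)

# Direct evaluation of (A-1) for comparison with W + B.
second_moment = np.einsum("i,ijk->jk", beta, P + np.einsum("ij,ik->ijk", mu, mu))
P_direct = second_moment - np.outer(x_bar, x_bar)
```

Here `W + B` and `P_direct` agree to rounding error, as (A-1) requires.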


Suppose the reduced mixture $p_A(x)$ is formed by merging several components of the original mixture $p(x)$, ie

$$p_A(x) = \beta' p'(x) + \sum_{i \notin C} \beta_i\, p_i(x)$$

where $p'(x)$ is the new component formed by merging those components with subscripts from the set $C$. To ensure that $p_A(x)$ is a proper pdf, the probability mass of the new component must be given by

$$\beta' = \sum_{i \in C} \beta_i .$$

If the means of $p(x)$ and $p_A(x)$ are to be equal,

$$\sum_{i} \beta_i \mu_i = \beta' \mu' + \sum_{i \notin C} \beta_i \mu_i .$$

Thus the mean of the new component is given by

$$\mu' = \frac{1}{\beta'} \sum_{i \in C} \beta_i \mu_i .$$

If the covariances of $p_A(x)$ and $p(x)$ are to be equal, from (A-1)

$$\sum_{i} \beta_i \left( P_i + \mu_i \mu_i^T \right) - \bar{x}\bar{x}^T = \beta' \left( P' + \mu' \mu'^T \right) + \sum_{i \notin C} \beta_i \left( P_i + \mu_i \mu_i^T \right) - \bar{x}\bar{x}^T .$$

Thus the covariance of the new component is given by

$$P' = \frac{1}{\beta'} \sum_{i \in C} \beta_i \left( P_i + \mu_i \mu_i^T \right) - \mu' \mu'^T .$$
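The merging rules for $\beta'$, $\mu'$ and $P'$ translate directly into code. The sketch below is a minimal illustration, with hypothetical function names and test data, which merges a chosen set $C$ of components and confirms that the overall mean and covariance are unchanged.

```python
import numpy as np

def overall_moments(beta, mu, P):
    """Mean and covariance of the whole mixture, from (A-1)."""
    x_bar = beta @ mu
    second = np.einsum("i,ijk->jk", beta, P + np.einsum("ij,ik->ijk", mu, mu))
    return x_bar, second - np.outer(x_bar, x_bar)

def merge(beta, mu, P, C):
    """Merge the components whose indices are in C into a single component."""
    C = sorted(C)
    keep = [i for i in range(len(beta)) if i not in C]
    b = beta[C]
    beta_new = b.sum()                                   # probability mass
    mu_new = (b @ mu[C]) / beta_new                      # mean
    second = np.einsum("i,ijk->jk", b, P[C] + np.einsum("ij,ik->ijk", mu[C], mu[C]))
    P_new = second / beta_new - np.outer(mu_new, mu_new)  # covariance
    return (np.append(beta[keep], beta_new),
            np.vstack([mu[keep], mu_new]),
            np.concatenate([P[keep], P_new[None]]))

# Hypothetical three-component mixture; components 0 and 1 are merged.
beta = np.array([0.5, 0.3, 0.2])
mu = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[2.0, 0.0], [0.0, 2.0]],
              [[1.0, 0.5], [0.5, 1.0]]])

beta_r, mu_r, P_r = merge(beta, mu, P, C=[0, 1])
mean_before, cov_before = overall_moments(beta, mu, P)
mean_after, cov_after = overall_moments(beta_r, mu_r, P_r)
```

The reduced mixture has two components, and its overall mean and covariance match those of the original mixture.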


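A.3 Merging components results in a loss of between component covariance

Let $W$ and $W'$ be the within component covariance of $p(x)$ and $p_A(x)$ respectively, and let $B$ and $B'$ be the between component covariance of $p(x)$ and $p_A(x)$ respectively (see section A.1). Then since overall covariance $P$ is preserved,

$$P = W + B = W' + B' .$$

Define the matrix $L$ as

$$L = B - B' = W' - W$$

from above. From sections A.1 and A.2,

$$W' - W = \beta' P' - \sum_{i \in C} \beta_i P_i .$$

However

$$\beta' P' - \sum_{i \in C} \beta_i P_i = \sum_{i \in C} \beta_i \left( P_i + \mu_i \mu_i^T \right) - \beta' \mu' \mu'^T - \sum_{i \in C} \beta_i P_i = \sum_{i \in C} \beta_i \mu_i \mu_i^T - \beta' \mu' \mu'^T$$

and

$$\beta' \mu' \mu'^T = \sum_{i \in C} \beta_i \mu' \mu'^T = \sum_{i \in C} \beta_i \mu_i \mu'^T = \sum_{i \in C} \beta_i \mu' \mu_i^T ,$$

therefore

$$L = W' - W = \sum_{i \in C} \beta_i \left[ \mu_i \mu_i^T - \mu_i \mu'^T - \mu' \mu_i^T + \mu' \mu'^T \right] = \sum_{i \in C} \beta_i (\mu_i - \mu')(\mu_i - \mu')^T .$$

Thus $L$ is a positive semidefinite matrix and in this sense the merging of components results in a loss of between component covariance.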

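A.4 The loss of between component covariance resulting from merging two components

Suppose that only two components, $i$ and $j$, are merged. Then from (A-2), the probability mass of the new component is

$$\beta' = \beta_i + \beta_j ,$$

the mean of the new component is

$$\mu' = \frac{\beta_i \mu_i + \beta_j \mu_j}{\beta_i + \beta_j}$$

and the covariance of the new component is

$$P' = \frac{1}{\beta'} \left( \beta_i P_i + \beta_j P_j \right) + \frac{1}{\beta'} \left( \beta_i \mu_i \mu_i^T + \beta_j \mu_j \mu_j^T \right) - \frac{1}{\beta'^2} \left( \beta_i \mu_i + \beta_j \mu_j \right) \left( \beta_i \mu_i + \beta_j \mu_j \right)^T$$

$$P' = \frac{1}{\beta'} \left( \beta_i P_i + \beta_j P_j \right) + \frac{\beta_i \beta_j}{\beta'^2} (\mu_i - \mu_j)(\mu_i - \mu_j)^T .$$

From (A-2), the loss of between component covariance resulting from joining $i$ and $j$ is

$$L_{ij} = \beta' P' - \left( \beta_i P_i + \beta_j P_j \right) = \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\mu_i - \mu_j)(\mu_i - \mu_j)^T .$$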


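A.5 The relationship between $d^2_{ij}$ and $L_{ij}$

Consider

$$\mathrm{tr}\left( P^{-1} L_{ij} \right) = \mathrm{tr}\left[ P^{-1} \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\mu_i - \mu_j)(\mu_i - \mu_j)^T \right] = \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\mu_i - \mu_j)^T P^{-1} (\mu_i - \mu_j) = d^2_{ij}$$

from (2) of section 4.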
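A small numerical check of this identity is sketched below. The helper names and the mixture are hypothetical. The script evaluates $d^2_{ij}$ from the weights and means, compares it with $\mathrm{tr}(P^{-1} L_{ij})$, and also confirms that the value is smaller than the state dimension $n$, as shown in section A.7.

```python
import numpy as np

def overall_covariance(beta, mu, P):
    """Covariance of the whole mixture, from (A-1)."""
    x_bar = beta @ mu
    second = np.einsum("i,ijk->jk", beta, P + np.einsum("ij,ik->ijk", mu, mu))
    return second - np.outer(x_bar, x_bar)

def distance_sq(beta, mu, P_inv, i, j):
    """Squared distance between components i and j."""
    diff = mu[i] - mu[j]
    return beta[i] * beta[j] / (beta[i] + beta[j]) * float(diff @ P_inv @ diff)

def loss_matrix(beta, mu, i, j):
    """Loss of between component covariance when i and j are merged."""
    diff = mu[i] - mu[j]
    return beta[i] * beta[j] / (beta[i] + beta[j]) * np.outer(diff, diff)

# Hypothetical three-component mixture in n = 2 dimensions.
beta = np.array([0.5, 0.3, 0.2])
mu = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[2.0, 0.0], [0.0, 2.0]],
              [[1.0, 0.5], [0.5, 1.0]]])

P_inv = np.linalg.inv(overall_covariance(beta, mu, P))
d2 = distance_sq(beta, mu, P_inv, 0, 1)
trace_form = float(np.trace(P_inv @ loss_matrix(beta, mu, 0, 1)))
```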

A.6 $d^2_{ij}$ is invariant under non-singular linear transformations of $x$

Consider the transformation

$$y = A x + b \qquad \text{(A-3)}$$

where the inverse of $A$ exists. If

$$p(x) = \sum_{i=1}^{N} \beta_i\, \mathcal{N}(x; \mu_i, P_i)$$

then under the above transformation

$$p(y) = \sum_{i=1}^{N} \beta_i\, \mathcal{N}(y; \nu_i, Q_i)$$

where $\nu_i = A \mu_i + b$ and $Q_i = A P_i A^T$.

The distance between components $i$ and $j$ of $p(y)$ is given by

$$d^2_{ij} = \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\nu_i - \nu_j)^T Q^{-1} (\nu_i - \nu_j) \qquad \text{(A-4)}$$

where $Q$ is the covariance of the mixture $p(y)$. From the linearity of the expectation operator $Q = A P A^T$. Also

$$\nu_i - \nu_j = A (\mu_i - \mu_j)$$

so on substituting into (A-4),

$$d^2_{ij} = \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\mu_i - \mu_j)^T A^T \left( A P A^T \right)^{-1} A (\mu_i - \mu_j) .$$

Hence, since $A^T \left( A P A^T \right)^{-1} A = P^{-1}$, the distance measure is invariant under the transformation (A-3).
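The invariance can be illustrated numerically. In the sketch below the matrix `A`, the offset `b` and the mixture are arbitrary choices made for the example, and the helper names are hypothetical. Each component is transformed according to $\nu_i = A\mu_i + b$ and $Q_i = A P_i A^T$, and the distances are recomputed in the new co-ordinates.

```python
import numpy as np

def overall_covariance(beta, mu, P):
    """Covariance of the whole mixture, from (A-1)."""
    x_bar = beta @ mu
    second = np.einsum("i,ijk->jk", beta, P + np.einsum("ij,ik->ijk", mu, mu))
    return second - np.outer(x_bar, x_bar)

def distance_sq(beta, mu, P, i, j):
    """Squared distance between components i and j of the given mixture."""
    P_inv = np.linalg.inv(overall_covariance(beta, mu, P))
    diff = mu[i] - mu[j]
    return beta[i] * beta[j] / (beta[i] + beta[j]) * float(diff @ P_inv @ diff)

beta = np.array([0.5, 0.3, 0.2])
mu = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[2.0, 0.0], [0.0, 2.0]],
              [[1.0, 0.5], [0.5, 1.0]]])

A = np.array([[2.0, 1.0], [0.5, 3.0]])   # non-singular (determinant 5.5)
b = np.array([10.0, -4.0])

nu = mu @ A.T + b        # transformed means, row by row
Q = A @ P @ A.T          # transformed covariances, one per component

pairs = [(0, 1), (0, 2), (1, 2)]
before = [distance_sq(beta, mu, P, i, j) for i, j in pairs]
after = [distance_sq(beta, nu, Q, i, j) for i, j in pairs]
```

The lists `before` and `after` agree to rounding error for every pair of components.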

A.7 The distance $d^2_{ij}$ is bounded

From section A.3,

$$P = W + B = W + B' + B - B' = (W + B') + L_{ij}$$

where $P$ and $W$ are positive definite matrices, and $B'$ and $L_{ij}$ are positive semidefinite matrices. Multiply through by $P^{-1}$ to give

$$I = P^{-1} P = P^{-1} (W + B') + P^{-1} L_{ij} .$$

Taking the trace gives

$$n = \mathrm{tr}\left[ P^{-1} (W + B') \right] + \mathrm{tr}\left( P^{-1} L_{ij} \right)$$

where $n$ is the dimension of the state space. Hence since $P^{-1}$ and $(W + B')$ are both positive definite,

$$\mathrm{tr}\left[ P^{-1} (W + B') \right] > 0$$

and so

$$d^2_{ij} < n .$$

Appendix B

THE MINIMUM DISTANCE BETWEEN COMPONENTS INCREASES MONOTONICALLY AS REDUCTION BY THE JOINING ALGORITHM PROCEEDS

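Suppose that at some stage during mixture reduction, the closest components have means $x$ and $y$ and weights $\beta_x$ and $\beta_y$. The distance between these components is $d_{min}$, where

$$d^2_{min} = f(\beta_x, \beta_y)\, \| x - y \|^2$$

where $\| x - y \|^2 = (x - y)^T P^{-1} (x - y)$ and $f(\beta_x, \beta_y) = \beta_x \beta_y / (\beta_x + \beta_y)$.

As they are closest, these components are merged to produce a new component with mean

$$w = \frac{\beta_x x + \beta_y y}{\beta_x + \beta_y}$$

and weight

$$\beta_w = \beta_x + \beta_y .$$

Now consider any other component with mean $z$ and weight $\beta_z$. The distances $d_{xz}$ and $d_{yz}$ between this component and either of the two which have been merged must be greater than or equal to $d_{min}$, so

$$d^2_{min} \le d^2_{xz} = f(\beta_x, \beta_z)\, \| x - z \|^2 \qquad \text{(B-1)}$$

$$d^2_{min} \le d^2_{yz} = f(\beta_y, \beta_z)\, \| y - z \|^2 . \qquad \text{(B-2)}$$

To confirm that the minimum distance increases monotonically as reduction proceeds, we must prove that

$$d^2_{zw} \ge d^2_{min} .$$

The distance between the new component and the component with mean $z$ is

$$d^2_{zw} = f(\beta_w, \beta_z)\, \| w - z \|^2 = \frac{\beta_z}{\beta_w + \beta_z} \left[ \beta_x \| x - z \|^2 + \beta_y \| y - z \|^2 - \frac{\beta_x \beta_y}{\beta_w} \| x - y \|^2 \right]$$

$$d^2_{zw} = \frac{1}{\beta_w + \beta_z} \left[ (\beta_x + \beta_z)\, d^2_{xz} + (\beta_y + \beta_z)\, d^2_{yz} - \beta_z\, d^2_{min} \right] .$$

Hence from (B-1) and (B-2)

$$d^2_{zw} \ge \frac{1}{\beta_w + \beta_z} \left[ (\beta_y + \beta_z)\, d^2_{min} + (\beta_x + \beta_z)\, d^2_{min} - \beta_z\, d^2_{min} \right] = \frac{\beta_w + \beta_z}{\beta_w + \beta_z}\, d^2_{min} = d^2_{min} .$$

This completes the proof.

Appendix C

BAYESIAN SOLUTION OF AN UNCERTAIN TRACKING PROBLEM

C.1 Introduction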

This  Appendix  contains  a  formal  statement  of  a  tracking  problem  of  which 
an  example  is  given  in  section  6.  This  tracking  problem,  which  is  taken  from 
Refs  1  and  3,  illustrates  many  of  the  difficulties  of  uncertain  tracking.  The 
purpose  of  this  Appendix  is  to  show  that  the  optimal  solution  of  the  tracking 
problem  generates  Gaussian  mixture  distributions  and  to  specify  the  optimal 
tracking  filter.  The  recurrence  relations  of  the  JAF  and  CAF  (see  section  6.1) 
are  the  same  as  the  optimal  filter,  except  that  received  measurements  are 
subjected  to  a  coarse  acceptance  test  and  the  Gaussian  mixture  (C-21)  is  approxi¬ 
mated  at  each  time  step. 

The solution of the tracking problem is approached from a Bayesian point of view (see Refs 1 to 4). We consider the conditional pdf of the state vector of the object at time $t_k$, conditioned by all the information available up to that time. This conditional pdf is a complete solution of the tracking problem. In section C.3 it is shown that the conditional pdf is a Gaussian mixture. Assuming the prior pdf of the state at time step $k$ is a Gaussian mixture and given the problem statement of section C.2, the posterior pdf, after updating with measurements received at this time step, is shown to be another Gaussian mixture, with an increased number of components. This posterior pdf is projected forwards to show that the prior pdf at the following time step $k + 1$ is also a Gaussian mixture. Thus the solution is established by induction.

C.2 Problem formulations

It is assumed that the state vector $x$ of the object of interest evolves according to a linear equation

$$x_{k+1} = \Phi x_k + \Gamma w_k \qquad \text{(C-1)}$$

where $x_k$ is the $n$-component state vector at time $t_k$, $\Phi$ is the $n \times n$ state transition matrix, $\Gamma$ is an $n \times r$ matrix and $w_k$ is an $r$-component vector of system driving noise which has a Gaussian distribution with zero mean and covariance

$$E\left[ w_k w_j^T \right] = Q\, \delta_{kj} . \qquad \text{(C-2)}$$

Here $Q$ is a positive definite $r \times r$ matrix and $\delta_{kj}$ is the Kronecker delta. The state vector contains the object position, and usually the velocity and possibly other attributes of the object. Also it is assumed that at time $t_1$, the state vector $x_1$ is known to have a Gaussian distribution with mean $\bar{x}_1$ and covariance $M_1$.

At every time step $k$ (ie at each scan), a number of measurements are received from the sensor. If $Z_k$ denotes the set of $m_k$ measurements received at time $t_k$, then

$$Z_k = \{ z_{kj} : j = 1, \ldots, m_k \} .$$

Each measurement $z_{kj}$ is a $u$-component vector. It is assumed that the object is well inside the surveillance region of the sensor, but that the (known) probability $P_D$ of detecting the object may be less than unity. It is also assumed that at most one of the measurements may originate from the object. If measurement $z_{kj}$ does originate from the object then it is related to the state vector by the linear relationship

$$z_{kj} = H x_k + v_k \qquad \text{(C-3)}$$

where $H$ is the $u \times n$ measurement matrix and $v_k$ is a $u$-component vector of measurement noise which has a Gaussian distribution with zero mean and covariance

$$E\left[ v_k v_j^T \right] = R\, \delta_{kj} . \qquad \text{(C-4)}$$

Here $R$ is a positive definite $u \times u$ matrix and $\delta_{kj}$ is the Kronecker delta. A measurement which originates from the object is said to be true, while all other measurements are false. A false measurement is assumed to be independent of the state vector, to have a uniform distribution over the surveillance region of the sensor and to be independent of all present and past measurements. False measurements are assumed to occur at an average density of $\rho$ per unit area. Further it is assumed that before examining the values of the measurements in the set $Z_k$, there is no information on which, if any, of the measurements are associated with the object. Note that if the identity of the true measurement were known, the problem would reduce to that of the standard Kalman filter.

The tracking problem is to estimate the state vector $x_k$ at each time step, based on the available information up to and including time $t_k$. It is assumed that $\Phi$, $\Gamma$, $Q$, $H$, $R$, $P_D$ and $\rho$ are given, together with all the measurements.

C.3  The  optimal  solution 

C.3.1 The prior distribution of the state vector at time $t_k$

The prior pdf of the state vector at time $t_k$ is the pdf of $x_k$ given all available information up to time $t_k$ but excluding the set $Z_k$ of measurements received at time $t_k$. This available prior information at time $t_k$ is denoted $Z^*_k$, and this includes all measurements received at the previous time steps:

$$Z^*_k \supseteq \{ Z_1, \ldots, Z_{k-1} \} .$$

Since any one or none of the measurements of $Z_j$ could be true, there are exactly $m_j + 1$ exclusive hypotheses concerning the truth or falsehood of the members of $Z_j$. Thus the total number of possible hypotheses under $Z^*_k$ is

$$\nu_{k-1} = \prod_{j=1}^{k-1} (m_j + 1) . \qquad \text{(C-5)}$$

Therefore, given $\nu_{k-1}$ possible hypotheses, the pdf of the state vector may be written

$$p(x_k \mid Z^*_k) = \sum_{i=1}^{\nu_{k-1}} p(x_k \mid \psi_{k-1,i}, Z^*_k)\, \Pr\{ \psi_{k-1,i} \mid Z^*_k \} . \qquad \text{(C-6)}$$

Here $\psi_{k-1,i}$ denotes one of the $\nu_{k-1}$ possible hypotheses on the measurements available under $Z^*_k$, $p(x_k \mid \psi_{k-1,i}, Z^*_k)$ is the pdf of $x_k$ assuming $\psi_{k-1,i}$ is correct and $Z^*_k$ is given, and $\Pr\{ \psi_{k-1,i} \mid Z^*_k \}$ is the probability that $\psi_{k-1,i}$ is correct given the information $Z^*_k$.

Now suppose that the conditional pdfs in the RHS of (C-6) are known to be Gaussian,

$$p(x_k \mid \psi_{k-1,i}, Z^*_k) = \mathcal{N}(x_k; \bar{x}_{ki}, M_{ki}) . \qquad \text{(C-7)}$$

Also suppose that the probabilities of the hypotheses are known and are denoted

$$\Pr\{ \psi_{k-1,i} \mid Z^*_k \} = \beta_{k-1,i} . \qquad \text{(C-8)}$$

In this case (C-6) is a fully specified Gaussian mixture pdf. Note that the above suppositions are true for $k = 1$.

C.3.2  The  posterior  pdf  of  the  state  vector 

The set $Z_k$ of measurements received at time $t_k$ is to be used to update the prior pdf of $x_k$ specified by (C-6) to (C-8). The resulting posterior pdf is denoted

$$p(x_k \mid Z_k, Z^*_k) .$$

In the following working we shall omit $Z^*_k$ for ease of notation, although the dependency should be understood for all conditional probabilities and pdfs. Thus the posterior pdf of $x_k$ will be written

$$p(x_k \mid Z_k) .$$

After updating with the latest set of measurements, the total number of possible hypotheses is increased to

$$\nu_k = \nu_{k-1} (m_k + 1) .$$
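To give a feel for how quickly the number of hypotheses grows if no approximation is made, the following sketch evaluates the product (C-5) for a made-up sequence of measurement counts. The function name and the counts are illustrative assumptions.

```python
from math import prod

def hypothesis_count(measurement_counts):
    """Number of hypotheses after scans with the given measurement counts.

    Each scan with m measurements multiplies the count by (m + 1): any one
    of the m measurements may be true, or all of them may be false.
    """
    return prod(m + 1 for m in measurement_counts)

# Hypothetical numbers of measurements received on four successive scans.
counts = [2, 3, 1, 4]
growth = [hypothesis_count(counts[:k]) for k in range(len(counts) + 1)]
```

For these counts `growth` is `[1, 3, 12, 24, 120]`, so the mixture already has 120 components after four scans.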


This increase may be viewed as a branching process where each of the $\nu_{k-1}$ prior hypotheses of (C-6) may be seen as a potential track and each of these tracks then splits into a further $m_k + 1$ tracks resulting from the new set of measurements. Thus a posterior hypothesis including the latest set of measurements $Z_k$ may be written as a joint hypothesis

$$\psi_{kij} = \{ \psi_{k-1,i}, \theta_{kj} \}$$

where $\theta_{kj}$ is independent of $\psi_{k-1,i}$, and indicates that the $j$th measurement of set $Z_k$ is true (or that they are all false if $j = 0$). The complete set of posterior hypotheses is

$$\{ \psi_{kij} : i = 1, \ldots, \nu_{k-1};\; j = 0, \ldots, m_k \} .$$

Hence the posterior pdf of $x_k$ may be written in the form

$$p(x_k \mid Z_k) = \sum_{i=1}^{\nu_{k-1}} \sum_{j=0}^{m_k} p(x_k \mid \psi_{kij}, Z_k)\, \Pr\{ \psi_{kij} \mid Z_k \} . \qquad \text{(C-9)}$$

First consider the posterior pdf of $x_k$ conditioned by $\psi_{kij}$: $p(x_k \mid \psi_{kij}, Z_k)$ is the probability density resulting from updating $p(x_k \mid \psi_{k-1,i})$ on the assumption that the $j$th measurement from $Z_k$ is true (for $j \ne 0$). In this case $z_{kj}$ is the only useful measurement from $Z_k$ and the other members of $Z_k$ can be discarded since they contain no relevant information. A true measurement has a Gaussian distribution:

$$p(z_{kj} \mid x_k) = \mathcal{N}(z_{kj}; H x_k, R)$$

and the prior density of $x_k$ under $\psi_{k-1,i}$ is also Gaussian, given by (C-7). Hence the required posterior density is also Gaussian and is given by the standard Kalman filter. So for $j \ne 0$,

$$p(x_k \mid \psi_{kij}, Z_k) = \mathcal{N}(x_k; \hat{x}_{kij}, P_{kij})$$

where

$$\hat{x}_{kij} = \bar{x}_{ki} + K_{ki} (z_{kj} - H \bar{x}_{ki}) , \quad K_{ki} = M_{ki} H^T S_{ki}^{-1} , \quad P_{kij} = M_{ki} - K_{ki} H M_{ki} \qquad \text{(C-10)}$$

and $S_{ki} = H M_{ki} H^T + R$.

If $j = 0$, none of the members of $Z_k$ are true and so the prior pdf is not modified:

$$\hat{x}_{ki0} = \bar{x}_{ki} , \qquad P_{ki0} = M_{ki} . \qquad \text{(C-11)}$$

Now turning to the second term in the summation of equation (C-9), the posterior probability that $\psi_{kij}$ is correct may be evaluated using Bayes theorem:

$$\Pr\{ \psi_{kij} \mid Z_k \} = \frac{p(Z_k \mid \psi_{kij})\, \Pr\{ \theta_{kj} \mid \psi_{k-1,i} \}\, \Pr\{ \psi_{k-1,i} \}}{p(Z_k)} . \qquad \text{(C-12)}$$

The equation (C-12) indicates how the prior probability $\Pr\{ \psi_{k-1,i} \}$ is modified by the observations at time $t_k$. The posterior probability can be found by evaluating the three factors in the numerator of the RHS of (C-12).

First consider $p(Z_k \mid \psi_{kij})$. This may be written

$$p(Z_k \mid \psi_{kij}) = \int p(x_k, Z_k \mid \psi_{kij})\, dx_k = \int p(Z_k \mid x_k, \psi_{kij})\, p(x_k \mid \psi_{kij})\, dx_k .$$

Since the elements of $Z_k$ are independent

$$p(Z_k \mid x_k, \psi_{kij}) = \prod_{l=1}^{m_k} p(z_{kl} \mid x_k, \psi_{kij}) . \qquad \text{(C-13)}$$

A measurement $z_{kl}$ is false under $\psi_{kij}$ if $j \ne l$. False measurements are uniformly distributed over the surveillance region of the sensor, and so the pdf of a false measurement is $V_k^{-1}$, where $V_k$ is the volume of the surveillance region. If $j = l$, the measurement $z_{kl}$ is true and so is a sample from the Gaussian distribution defined by (C-3). The prior pdf of $x_k$,

$$p(x_k \mid \psi_{kij}) = p(x_k \mid \psi_{k-1,i})$$

which is the Gaussian pdf (C-7). Hence on substituting into (C-13) we obtain, for $j \ne 0$,

$$p(Z_k \mid \psi_{kij}) = V_k^{-m_k + 1} \int \mathcal{N}(z_{kj}; H x_k, R)\, \mathcal{N}(x_k; \bar{x}_{ki}, M_{ki})\, dx_k = V_k^{-m_k + 1}\, \mathcal{N}(z_{kj}; H \bar{x}_{ki}, S_{ki}) \qquad \text{(C-14)}$$

where $S_{ki}$ is defined in the relations (C-10). Expression (C-14) is strictly correct only for a surveillance region of infinite extent. However, the truncation effect is negligible provided that, for each component of $z_{kj}$, the distance from $H \bar{x}_{ki}$ to the boundary of the surveillance region is large compared with the standard deviation of that component. If $j = 0$ so all the measurements are false,

$$p(Z_k \mid \psi_{ki0}) = V_k^{-m_k} . \qquad \text{(C-15)}$$

The  second  factor  in  the  numerator  of  (012)  is  the  prior  probability  of 


Hvl'wil  •  Prki} 


since the hypothesis on the current set of measurements is independent of
hypotheses on measurements from previous time steps. The only prior information
available is the probability $P_D$ of detecting the target and the probability
of the sensor receiving $m$ false measurements. If false measurements are
uniformly distributed over the measurement space with density $\rho$, then it
can be shown that the probability of $m$ false measurements falling within the
surveillance region of the sensor is given by a Poisson distribution. If the
volume of the surveillance region is $V$, the probability of receiving $m$
false measurements is given by

$$g(m) = \frac{(\rho V)^m e^{-\rho V}}{m!} . \tag{C-16}$$


The hypothesis $\psi_{k0}$ corresponds to the event of failing to detect the
target and receiving $m_k$ false measurements. The prior probability of this
occurrence is

$$\Pr\{\psi_{k0}\} = (1 - P_D)\, g(m_k) . \tag{C-17}$$

Any of the hypotheses $\psi_{kj}$, $j \neq 0$, could correspond to the situation
of detecting the target and receiving $m_k - 1$ false measurements. A priori,
each of these hypotheses is equally probable, and since there are $m_k$ of them
(for $j \neq 0$)

$$\Pr\{\psi_{kj}\} = P_D\, g(m_k - 1)/m_k . \tag{C-18}$$
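The prior hypothesis probabilities can be checked numerically. The following is a minimal sketch, assuming the Poisson false-measurement model described above; the function names `poisson_false_alarm` and `hypothesis_priors`, and the argument names `rho` (false-measurement density) and `V` (surveillance volume), are illustrative choices and are not taken from the Report.

```python
from math import exp, factorial, isclose

def poisson_false_alarm(m, rho, V):
    """Probability of m false measurements in volume V at density rho (Poisson)."""
    return (rho * V) ** m * exp(-rho * V) / factorial(m)

def hypothesis_priors(m_k, P_D, rho, V):
    """Prior probabilities of the hypotheses j = 0, 1, ..., m_k.

    Entry 0: target not detected, all m_k measurements false.
    Entries 1..m_k: target detected, m_k - 1 false, each j equally likely.
    The values are joint with the event 'm_k measurements received',
    so they do not sum to one.
    """
    p0 = (1.0 - P_D) * poisson_false_alarm(m_k, rho, V)
    if m_k == 0:
        return [p0]
    pj = P_D * poisson_false_alarm(m_k - 1, rho, V) / m_k
    return [p0] + [pj] * m_k

# Illustrative values (assumptions, not from the Report)
priors = hypothesis_priors(m_k=3, P_D=0.9, rho=0.01, V=100.0)
# The ratio of the j = 0 prior to any j != 0 prior reduces to (1 - P_D) * rho * V / P_D
ratio_ok = isclose(priors[0] / priors[1], (1 - 0.9) * 0.01 * 100.0 / 0.9)
```

The ratio check shows why only the density and not the Poisson factorial terms survives in the normalised posterior weights: the factor $m_k$ and the exponential cancel between the two kinds of hypothesis.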


The third factor in the numerator of (C-12) is given directly by (C-8):

$$\Pr\{\Psi_{k-1,i}\} = \beta_{k-1,i} . \tag{C-19}$$


Substituting (C-14) to (C-19) into (C-12) we obtain

$$\Pr\{\Psi_{kij} \mid Z_k\} = \begin{cases} \beta_{k-1,i}\, P_D\, N(z_{kj}; H\bar{x}_{ki}, S_{ki}) / D & \text{for } j \neq 0 \\ \beta_{k-1,i}\, (1 - P_D)\, \rho / D & \text{for } j = 0 \end{cases} \tag{C-20}$$

where

$$D = (1 - P_D)\,\rho + P_D \sum_{r=1}^{n_{k-1}} \beta_{k-1,r} \sum_{s=1}^{m_k} N(z_{ks}; H\bar{x}_{kr}, S_{kr})$$

is the normalising denominator. This equation is of key importance because it
defines the weightings of the mixture distribution (C-9). Note that if
$P_D = 1$, as in the example of section 6, knowledge of the density $\rho$ of
false measurements does not contribute to the posterior pdf.
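The weight calculation can be illustrated with a short sketch. It assumes Gaussian predicted-measurement densities for each prior component and a uniform false-measurement density; the names `gaussian_pdf`, `association_weights`, `z_pred` (predicted measurement of each prior component) and `S` (its innovation covariance) are hypothetical and chosen only for this example.

```python
import numpy as np

def gaussian_pdf(z, mean, cov):
    """Multivariate normal density evaluated at z."""
    d = np.asarray(z, dtype=float) - np.asarray(mean, dtype=float)
    n = d.size
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)

def association_weights(beta_prior, z_pred, S, Z, P_D, rho):
    """Posterior hypothesis weights as an (n, m + 1) array.

    Row i is prior component i; column 0 is 'all measurements false',
    column j >= 1 is 'measurement j is the true one'.
    """
    n, m = len(beta_prior), len(Z)
    w = np.empty((n, m + 1))
    for i in range(n):
        w[i, 0] = beta_prior[i] * (1.0 - P_D) * rho
        for j in range(m):
            w[i, j + 1] = beta_prior[i] * P_D * gaussian_pdf(Z[j], z_pred[i], S[i])
    # The total is the normalising denominator when beta_prior sums to one
    return w / w.sum()

# Illustrative data (assumed values)
beta_prior = np.array([0.6, 0.4])
z_pred = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
S = [np.eye(2), np.eye(2)]
Z = [np.array([0.1, -0.2]), np.array([4.8, 5.1]), np.array([20.0, 20.0])]
w = association_weights(beta_prior, z_pred, S, Z, P_D=0.9, rho=0.01)
```

With `P_D=1.0` the first column is zero and the result no longer depends on `rho`, which matches the remark about unit detection probability.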

Thus the posterior pdf of $x_k$ given by (C-9) is a fully specified
Gaussian mixture. (C-9) can be rewritten as a single sum by defining

$$\beta_{k\ell} = \Pr\{\Psi_{kij} \mid Z_k\}$$

where $\ell = (i - 1)(m_k + 1) + j + 1$, for $i = 1, \ldots, n_{k-1}$ and
$j = 0, \ldots, m_k$:

$$p(x_k \mid Z_k) = \sum_{\ell=1}^{n_k} p(x_k \mid \Psi_{k\ell}, Z_k)\, \Pr\{\Psi_{k\ell} \mid Z_k\} \tag{C-21}$$

where $n_k = n_{k-1}(m_k + 1) = \prod_{i=1}^{k} (m_i + 1)$,

$$p(x_k \mid \Psi_{k\ell}, Z_k) = N(x_k; \hat{x}_{k\ell}, P_{k\ell})$$

and $\Pr\{\Psi_{k\ell} \mid Z_k\} = \beta_{k\ell}$.

The Gaussian mixture (C-21) contains all the available information on the state
vector taking account of the latest set of measurements $Z_k$. Thus in
principle, the optimal estimate based on any desired criterion may be obtained.
In particular the minimum mean square error estimate is the mean of the
distribution:

$$\hat{x}_k = \sum_{\ell=1}^{n_k} \beta_{k\ell}\, \hat{x}_{k\ell} .$$

However a single value of $x_k$ is a somewhat inadequate summary of a mixture
distribution, especially if there are significant well spaced components.
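A small sketch may help with the single-index bookkeeping and the mixture mean. The function names `flat_index` and `mmse_estimate` are hypothetical, and the numerical values are assumptions chosen to show how the mean can fall between two well spaced components.

```python
import numpy as np

def flat_index(i, j, m_k):
    """Single index for prior component i (1-based) and hypothesis j (0..m_k)."""
    return (i - 1) * (m_k + 1) + j + 1

def mmse_estimate(beta, means):
    """Mixture mean: weighted sum of the component means."""
    beta = np.asarray(beta, dtype=float)
    means = np.asarray(means, dtype=float)
    return beta @ means

# Every (i, j) pair maps to a distinct index in 1..n_prev * (m_k + 1)
n_prev, m_k = 2, 3
indices = [flat_index(i, j, m_k) for i in range(1, n_prev + 1) for j in range(m_k + 1)]

# Two equally weighted, well separated components (assumed values)
x_hat = mmse_estimate([0.5, 0.5], [[-10.0, 0.0], [10.0, 0.0]])
```

In this example `x_hat` is the origin, a point that lies far from both components, which illustrates why the mean alone can be a poor summary of a mixture.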


To establish, by induction, the general property that the prior pdf of
$x_k$ (equation (C-6)) is a fully specified Gaussian mixture, it is necessary to
derive the pdf of $x_{k+1}$ from the result (C-21). This pdf may be derived from
$p(x_k \mid Z_k, \mathcal{P}_k)$ (note that $\mathcal{P}_k$ is reinstated here)
via the propagation equation (C-1). This information, together with $Z_k$ and
$\mathcal{P}_k$, is denoted $\mathcal{P}_{k+1}$, which is all the prior
information available at time $t_{k+1}$. The prior pdf of $x_{k+1}$ may be
written

$$p(x_{k+1} \mid \mathcal{P}_{k+1}) = \int p(x_{k+1} \mid x_k)\, p(x_k \mid \mathcal{P}_{k+1})\, dx_k . \tag{C-22}$$

$p(x_{k+1} \mid x_k)$ is defined by the state propagation equations, and the
second term

$$p(x_k \mid \mathcal{P}_{k+1}) = p(x_k \mid Z_k, \mathcal{P}_k)$$

since the extra information on state propagation from $t_k$ to $t_{k+1}$ does
not contribute to the pdf of state at $t_k$. Substituting (C-21) into (C-22) and
performing the integrations gives

$$p(x_{k+1} \mid \mathcal{P}_{k+1}) = \sum_{\ell=1}^{n_k} \Pr\{\Psi_{k\ell} \mid \mathcal{P}_{k+1}\}\, p(x_{k+1} \mid \Psi_{k\ell}, \mathcal{P}_{k+1}) \tag{C-23}$$


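where

$$\Pr\{\Psi_{k\ell} \mid \mathcal{P}_{k+1}\} = \beta_{k\ell}$$

and $p(x_{k+1} \mid \Psi_{k\ell}, \mathcal{P}_{k+1})$ is a Gaussian pdf whose mean
and covariance are obtained by propagating the posterior mean and covariance of
component $\ell$ from $t_k$ to $t_{k+1}$ through the propagation equation (C-1).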


The pdf (C-23) is of the same form as (C-6): it is a fully specified Gaussian
mixture. Hence the initial supposition of section C.3.1 is proved by induction.
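The propagation step can be sketched as follows, assuming a linear state model $x_{k+1} = F x_k + w_k$ with zero-mean Gaussian noise of covariance $Q$. The symbols `F` and `Q` and the function name `predict_mixture` are assumptions made for this illustration; the Report's own propagation equation (C-1) may use different notation.

```python
import numpy as np

def predict_mixture(beta, means, covs, F, Q):
    """Propagate each Gaussian component one step; the weights are unchanged."""
    pred_means = [F @ m for m in means]
    pred_covs = [F @ P @ F.T + Q for P in covs]
    return np.array(beta, dtype=float), pred_means, pred_covs

# One-dimensional constant-velocity example (assumed values)
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])
Q = 0.1 * np.eye(2)
beta = [0.7, 0.3]
means = [np.array([0.0, 1.0]), np.array([5.0, -1.0])]
covs = [np.eye(2), 2.0 * np.eye(2)]
beta_pred, means_pred, covs_pred = predict_mixture(beta, means, covs, F, Q)
```

Because each component is propagated independently and the weights carry over unchanged, the predicted density has the same number of components as the posterior it came from.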

C.4  Discussion 

It has been shown that the posterior pdf of the state vector, just after
incorporating the latest set of measurements, is a Gaussian mixture given by
equation (C-21). This equation is a complete description of the filter's
knowledge of the state vector. Each component of the mixture represents a
potential track and is a Kalman filter estimate of the state vector based on a
possible history of true and false measurements. At time $t_k$, the components
represent all feasible track histories. The weighting $\beta_{k\ell}$ is the
probability that track history $\ell$ is the correct one.

For most interesting cases, the number of components $n_k$ becomes very
large with increasing $k$ (see (C-21)). Since every component must be
propagated at each time step, implementation of the optimal solution is
impractical, hence the need for the reduction algorithms which are the subject
of this Report.
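The growth in the number of components is easy to quantify. The sketch below assumes a single initial component and multiplies the count by one plus the number of measurements at each time step; the function name `component_count` and the measurement counts are illustrative only.

```python
from math import prod

def component_count(measurement_counts):
    """Number of mixture components after processing the given scans,
    starting from a single component."""
    return prod(m + 1 for m in measurement_counts)

# Three measurements per scan for ten scans (assumed values)
n_10 = component_count([3] * 10)  # 4**10 = 1048576
```

Even this modest measurement rate gives over a million components after ten scans if nothing is merged or discarded.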
Appendix D

THE  COARSE  ACCEPTANCE  CASE 

A coarse acceptance test is applied to the sensor measurements to reject
any hypothesis that appears to be very unlikely on the basis of prior
information. This test is computationally inexpensive as the unlikely hypotheses
are rejected before their corresponding posterior mixture components need be
evaluated. Hopefully the effect of this acceptance test on the posterior
distribution will be insignificant. The mixture reduction algorithm is applied
after the posterior mixture distribution has been compiled.

Each component of the posterior pdf of the state vector is generated by
updating a feasible track from the prior pdf, either with a received measurement
or by prediction on the assumption that all received measurements are false
(see Appendix C, section C.3.2). Consider the prior track, or component $i$ of
(C-6), that corresponds to hypothesis $\Psi_{k-1,i}$. Under hypothesis
$\Psi_{kij}$ ($j \neq 0$), measurement $z_{kj}$ is true and is used to update
prior component $i$. From (C-14), the prior pdf of $z_{kj}$ under $\Psi_{kij}$
($j \neq 0$) is given by

$$N(z_{kj}; H\bar{x}_{ki}, S_{ki}) .$$


From knowledge of this distribution, an acceptance or validation region in the
measurement space may be defined, such that under hypothesis $\Psi_{kij}$, the
probability of the true measurement falling outside the region is very small.
(This type of acceptance test is commonly applied to measurement-track
association problems where ambiguities may exist - see Refs 2, 9 and 11.) If the
validation region is chosen so that the probability density of the true
measurement at any point within the region exceeds that at all points outside
the region, then the acceptance region is bounded by a hyperellipsoid. Thus a
measurement $z_{kj}$ is accepted for updating hypothesis $\Psi_{k-1,i}$ if and
only if

$$(z_{kj} - H\bar{x}_{ki})^T S_{ki}^{-1} (z_{kj} - H\bar{x}_{ki}) \leq T_A . \tag{D-1}$$


Note that since the false measurements have a uniform distribution, this is
equivalent to subjecting each measurement to a likelihood ratio test. For a
true measurement $z_{kj}$, under hypothesis $\Psi_{kij}$, the LHS of (D-1) is a
sample from a $\chi^2$ distribution with degrees-of-freedom equal to the
dimension of $z_{kj}$. Thus the value of $T_A$ corresponding to a probability
$\alpha$ of missing the true measurement (if the object is detected and
$\Psi_{k-1,i}$ is correct) may be obtained from tables of $\chi^2$. In the
simulation of section 6, $\alpha$ is set to 0.001, which corresponds to
$T_A = 13.82$ for two-dimensional measurement space. Note that a different
acceptance region must be defined for each component of (C-6). To take account
of the possibility of rejecting the true measurement, the detection probability
$P_D$ should be replaced by $P_D(1 - \alpha)$. Thus even if $P_D = 1$, a
component is generated for the finite probability of missing the true
measurement.
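The acceptance test can be sketched in a few lines. For a two-dimensional measurement the chi-square tail probability has the closed form $\exp(-T/2)$, so the threshold follows directly from the miss probability; other dimensions would need chi-square tables or a statistics library. The function names `gate_threshold_2d` and `accept`, and the example measurements, are assumptions made for this illustration.

```python
import math
import numpy as np

def gate_threshold_2d(alpha):
    """Threshold T with P(chi-square, 2 dof > T) = alpha, using exp(-T/2) = alpha."""
    return -2.0 * math.log(alpha)

def accept(z, z_pred, S, threshold):
    """True if the normalised squared distance of z from z_pred is within the gate."""
    d = np.asarray(z, dtype=float) - np.asarray(z_pred, dtype=float)
    return float(d @ np.linalg.solve(S, d)) <= threshold

T_A = gate_threshold_2d(0.001)

# Unit innovation covariance, predicted measurement at the origin (assumed values)
S = np.eye(2)
inside = accept([3.0, 0.0], [0.0, 0.0], S, T_A)   # squared distance 9
outside = accept([4.0, 0.0], [0.0, 0.0], S, T_A)  # squared distance 16
```

The computed threshold agrees with the tabulated value of 13.82 quoted above to two decimal places.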